Correcting Dependency Annotation Errors

Author

  • Markus Dickinson
Abstract

Building on work detecting errors in dependency annotation, we set out to correct local dependency errors. To do this, we outline the properties of annotation errors that make the task challenging and their existence problematic for learning. For the task, we define a feature-based model that explicitly accounts for non-relations between words, and then use ambiguities from one model to constrain a second, more relaxed model. In this way, we are successfully able to correct many errors, in a way which is potentially applicable to dependency parsing more generally.

Similar resources

Detecting and Correcting Errors in an English Tectogrammatical Annotation

We present our first experiments with detecting and correcting errors in a manual annotation of English texts, taken from the Penn Treebank, at the dependency-based tectogrammatical layer, as it is defined in the Prague Dependency Treebank. The main idea is that errors in the annotation usually result in an inconsistency, i.e., the state when a phenomenon is annotated in different ways at sever...

Automatic Error Detection in Annotated Corpora

An annotated corpus is a linguistic resource that explicitly encodes information at the syntactic and semantic levels for each sentence. Annotated corpora play a crucial role in many applications of natural language processing (NLP). Error-free and consistent annotated corpora are vital for these applications. Creating annotated corpora is an expensive and time-consuming process. Errors or anomali...

Detecting Syntactic Errors in Dependency Treebanks for Morphosyntactically Rich Languages

The paper introduces a new method for detecting and correcting errors in large dependency treebanks with rich morphosyntactic annotation. The technique uses error correction rules automatically extracted from the treebank. The procedure of rule extraction is based on a comparison of similar – but not identical – subgraphs of dependency structures. The outcome of applying the method to a 3-milli...

Indonesian Dependency Treebank: Annotation and Parsing

We introduce and describe ongoing work on our Indonesian dependency treebank. We describe characteristics of the source data as well as our annotation guidelines for creating the dependency structures. Reported within are the results from the start of the Indonesian dependency treebank. We also show ensemble dependency parsing and self training approaches applicable to under-resourced...

Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie

This article presents the methods, results, and precision of the syntactic annotation process of the Rhapsodie Treebank of spoken French. The Rhapsodie Treebank is a 33,000-word corpus annotated for prosody and syntax, licensed in its entirety under Creative Commons. The syntactic annotation contains two levels: a macro-syntactic level, containing a segmentation into illocutionary units (inclu...

Publication date: 2009